Published on Jun 05, 2023
The memory hierarchy of high-performance and embedded processors has been shown to be one of the major energy consumers. If current trends continue, its share of total energy is likely to increase in the near future. This paper proposes a technique that uses an additional mini cache, called the L0-cache, located between the I-cache and the CPU core. This mechanism can provide the instruction stream to the data path and, when managed properly, can greatly reduce how often the more expensive I-cache must be accessed.
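As a rough illustration of why a small cache in front of the I-cache can save energy, the sketch below models the average energy per instruction fetch. It assumes (this is not taken from the paper) that every fetch probes the L0-cache first and that an L0 miss additionally pays for one I-cache access. The function names and the per-access energies are hypothetical, in arbitrary units.

```python
def average_fetch_energy(l0_hit_rate, e_l0, e_icache):
    """Average energy per fetch when the L0-cache is probed first.

    Assumed model: every fetch pays e_l0; an L0 miss also pays e_icache.
    """
    if not 0.0 <= l0_hit_rate <= 1.0:
        raise ValueError("l0_hit_rate must be between 0 and 1")
    return e_l0 + (1.0 - l0_hit_rate) * e_icache


def relative_saving(l0_hit_rate, e_l0, e_icache):
    """Fraction of energy saved relative to fetching only from the I-cache."""
    return 1.0 - average_fetch_energy(l0_hit_rate, e_l0, e_icache) / e_icache
```

Under this simplified model the L0-cache pays off only when its hit rate exceeds e_l0 / e_icache. Below that point the extra probe costs more than it saves, which is one way to read the requirement that the L0-cache be managed properly.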
Cache memories account for an increasing fraction of a chip's transistors and overall energy dissipation. Current proposals for resizable caches vary fundamentally in two design aspects:
(1) cache organization, where one organization, referred to as selective-ways, varies the cache's set associativity, while the other, referred to as selective-sets, varies the number of cache sets, and
(2) resizing strategy, where one proposal statically sets the cache size prior to an application's execution, while the other allows for dynamic resizing both across and within applications.
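The difference between the two organizations can be sketched with the usual capacity relation, size = sets × ways × block size. The names below (`CacheConfig`, `selective_ways_sizes`, `selective_sets_sizes`, `min_sets`) are hypothetical, and halving the set count in powers of two is an assumption made here so that the set index remains a simple bit field of the address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheConfig:
    num_sets: int
    ways: int
    block_bytes: int

    @property
    def size_bytes(self):
        # Capacity of a set-associative cache.
        return self.num_sets * self.ways * self.block_bytes


def selective_ways_sizes(base):
    """Sizes reachable by enabling 1..base.ways ways (set count fixed)."""
    return [CacheConfig(base.num_sets, w, base.block_bytes).size_bytes
            for w in range(1, base.ways + 1)]


def selective_sets_sizes(base, min_sets=1):
    """Sizes reachable by repeatedly halving the set count (ways fixed)."""
    min_sets = max(1, min_sets)  # guard against a non-terminating loop
    sizes = []
    sets = base.num_sets
    while sets >= min_sets:
        sizes.append(CacheConfig(sets, base.ways, base.block_bytes).size_bytes)
        sets //= 2
    return sorted(sizes)


# Illustrative configuration only: 256 sets, 4 ways, 32-byte blocks (32 KiB).
base = CacheConfig(num_sets=256, ways=4, block_bytes=32)
print(selective_ways_sizes(base))               # [8192, 16384, 24576, 32768]
print(selective_sets_sizes(base, min_sets=64))  # [8192, 16384, 32768]
```

In this example selective-ways offers evenly spaced sizes but lowers associativity as the cache shrinks, whereas selective-sets keeps associativity fixed and offers power-of-two sizes.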
Five techniques are proposed and evaluated that dynamically analyze the program's instruction access behavior and proactively guide the L0-cache. The basic idea is that only the most frequently executed portion of the code should be stored in the L0-cache, since this is where the program spends most of its time. Experimental results indicate that more than 60% of the energy dissipated in the I-cache subsystem can be saved.
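The basic idea of keeping only frequently executed code in the L0-cache can be illustrated with a minimal simulation. This is not one of the five techniques from the paper: it is a simple counter-based stand-in, and the class name `L0Cache`, the `hot_threshold` parameter, and the direct-mapped organization are all assumptions chosen for illustration.

```python
from collections import Counter


class L0Cache:
    """Tiny direct-mapped L0-cache that admits only frequently fetched blocks."""

    def __init__(self, num_lines=8, block_bytes=16, hot_threshold=3):
        self.num_lines = num_lines
        self.block_bytes = block_bytes
        self.hot_threshold = hot_threshold  # hypothetical admission threshold
        self.lines = [None] * num_lines     # block number held by each line
        self.fetch_counts = Counter()       # fetches observed per block
        self.l0_hits = 0
        self.icache_accesses = 0

    def fetch(self, address):
        block = address // self.block_bytes
        index = block % self.num_lines
        self.fetch_counts[block] += 1
        if self.lines[index] == block:
            self.l0_hits += 1
            return "L0"
        # L0 miss: the instruction is supplied by the I-cache.
        self.icache_accesses += 1
        if self.fetch_counts[block] >= self.hot_threshold:
            # Admit the block only once it has been fetched often enough.
            self.lines[index] = block
        return "I-cache"


cache = L0Cache()
for _ in range(10):             # a hot loop touching three blocks
    for addr in (0, 16, 32):
        cache.fetch(addr)
cache.fetch(128)                # cold code mapping to the same line as address 0
cache.fetch(0)                  # still served by the L0-cache
print(cache.l0_hits, cache.icache_accesses)  # 22 10
```

Because the cold block at address 128 never reaches the threshold, it is served by the I-cache without evicting the hot loop block that shares its line. A hardware implementation would need bounded counters rather than the unbounded `Counter` used here.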